[57] ABSTRACT

Disclosed is a system and method which systematically
diagnoses emissions test failure by applying the rules of a
knowledge base to predict the cause of vehicle emissions
failures. Classifiers are used to form predictions. The
classifier is the data structure used in the automobile emission
testing inspection lane by the lane diagnostic subsystem to
provide a diagnosis for a particular vehicle. Its output is the
likelihood that a vehicle suffers from a given failure based
on the values of characteristics such as its emissions test
results and the vehicle's description. The classifier
predictions are then used to prepare a failure report that is
given to the motorist for use by his or her repair technician. In
another feature of this invention, the classifiers are
continuously updated in a learning process based on new repair
records. The learning process periodically analyzes the
data and updates the knowledge base to include new or
revised classifiers.

20 Claims, 7 Drawing Sheets

METHOD AND SYSTEM FOR DIAGNOSING AND REPORTING FAILURE OF A
VEHICLE EMISSION TEST

This is a continuation of application Ser. No. 08/414,925,
filed Mar. 31, 1995, U.S. Pat. No. 5,729,452.

FIELD OF THE PRESENT INVENTION

This invention relates to automobile emissions testing and
more particularly to a system and method for predicting the
cause of an automobile's failure of an emissions test.

BACKGROUND OF THE INVENTION

In geographical locations having poor air quality, the
United States federal government has mandated vehicle
emission inspection and maintenance (I/M) programs in an
effort to enforce emission limit laws on automobile owners.
The objective of these programs is to identify vehicles
whose emissions control systems no longer perform acceptably
and require those vehicles to receive the necessary
repairs and/or maintenance. The owner of a car which is
within the allowable limits is presented with a certificate of
compliance. However, an owner of a car which is not within
the allowable limits must repair the automobile so that its
emissions are within the allowable levels.

Because of the federal mandate, approximately 34 million
vehicles are tested annually. However, nearly 8.1 million fail
the test and must be repaired. It is estimated that $975
million is spent in parts and service sales in repairing
vehicles to bring them into compliance with federal emission
standards.

A vehicle owner presented with a non-compliance report
typically will engage an automobile repair service provider
to bring the vehicle into compliance. However, because of
the number of different types of vehicles and models, it is
often difficult for an independent repair service provider to
reliably determine the cause of failure. For example, in one
state inspection program a sample of 10,450 initial inspection
failures led to 4,400 re-inspection failures, indicating a
forty-two percent (42%) failure to repair. The retest failure of
forty-two percent (42%) of 8.1 million failed vehicles is
equivalent to 3.4 million that must be repaired further and
tested a third time or deemed eligible for a waiver if the
repair costs of that particular vehicle exceeded statutory
limits.

The cost to vehicle owners for unsuccessful repairs as
well as to the air quality for continued excessive emissions
is very high. Moreover, even in automobiles which are able
to pass, oftentimes their reported emission measurements
are close to the limits allowable by law and thus could
benefit from lowering. It would be beneficial for the
vehicle's regular service technician to service the car in a
manner which he or she knowingly could improve emission levels
in such a case. Thus, it would be beneficial if a testing facility
were able to provide an analysis of causes of emissions that
are either close to or over legal limits at the same time the
vehicle owner is presented with an emissions test report.

In some states, hot-lines exist for automobile repair
service providers to call for help in diagnosing test failure
results. Experts talk with service providers to brainstorm a
solution to the emission problem. However, with each
vehicle, there are many variables to consider, including
multiple emissions category failures. Therefore, it is
desirable to systematically diagnose failure to provide a
relatively reliable and accurate prediction of the type of repair
which would bring the vehicle into compliance with emission laws.
Moreover, it would be beneficial to provide the automobile
owner with a prediction prior to bringing the vehicle to a
repair technician for evaluation.

SUMMARY OF THE INVENTION

This invention includes the preparation of a diagnostic
report or a diagnostic assessment for a vehicle owner to
use in repairing his or her vehicle to bring its emissions into
compliance with emission standards. The diagnostic
assessment gives the vehicle owner's service technician
probabilistic information about the likely causes of the vehicle's
failure of the emissions test. The diagnosis is derived from
operations involving a classifier table which stores
previously derived rules which form the basis for the prediction
of the diagnostic assessment. If a vehicle which previously
failed the emission test finally passes, information relating to
the passing test is used to update the classifier table.

More particularly, a classifier of the classifier table is the
data structure used in the automobile emission testing
inspection lane by the lane diagnostic subsystem, which runs
on the lane controller computer, to provide a diagnosis for a
particular vehicle. It allows a quick evaluation of the
likelihood that a vehicle suffers from a given failure based on the
values of characteristics such as its emissions test results and
the vehicle's description.

The classifier predictions are then used to prepare a failure
report that is given to the motorist for use by his or her repair
technician. The diagnosis reached by the system will be
uploaded and stored on a central database server computer
for purposes of reporting, correlation with actual repair, and
inclusion in the knowledge base.

In another feature of this invention, the classifiers are
continuously updated in a learning process based on new
repair records. The learning process periodically analyzes
the repair data and updates the knowledge base to include
new or revised classifiers. The learning process will explore,
identify and predict failures that correlate with parameters
such as the following: vehicle make and model year; vehicle
mileage; on-board-diagnostics (OBD) data; emissions
composite values; and emissions second-by-second values.

The learning process can be described in terms of its
inputs, outputs and functions. Inputs to the learning
process utility are suitably prepared data from the following:
vehicle test records; vehicle emissions repair records; and
diagnostic records. The outputs of the learning process
utility are for example: new classifiers; learning process log
entry; administrative report; and a pattern report. The
general functions of the learning process are to describe the
data, determine patterns of significance, and create a
classification data structure (classifier) and mechanisms for
applying the classifier in a predictive mode. The predictive
accuracy of the classifier is evaluated periodically using a
dataset representative of current program vehicles. The
classifier is updated as needed to maintain or improve
accuracy.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts hardware elements of this invention;

FIGS. 2a-2f are graphs of emissions and purge test results;

FIGS. 3a-3d show an example of a prediction report;

FIG. 4a is a diagram of the method of this invention;

FIG. 4b is a legend of the diagram of FIG. 4a;

FIG. 5 is a systematic diagram of the learning process
feature of this invention;
FIG. 6 is an example of a CHAID or similar algorithm tree
output; and

FIG. 7 is a systematic diagram of the lane controller with
the diagnostic assessment generation.

DETAILED DESCRIPTION OF THE INVENTION

This description is broken down into three distinct
sections. The first section describes the emissions testing
process in general with reference to the initial diagnostic
assessment feature of this invention. The second section
describes the learning process feature of this invention
which uses, among other things, emissions testing
information generated from retests of previously failed
vehicles to update the classifiers used to make initial
diagnostic assessments. The third section ties the elements of the
first section and the second section together with reference
to the interaction of this invention with the operations of an
inspection lane.

As an introduction, the following is a discussion about the
features of an emissions inspection system. Generally, any
of a number of vehicle emissions testing regimes (or
procedures) can be used and this invention is not limited to
any one of them. Examples include two-speed idle, loaded
steady-state, ASM 50-15, ASM 2525, ASM 2, and I/M 240.
I/M 240 will be used as an example to illustrate aspects
of the data collection feature of the invention. As mentioned
above, I/M refers to inspection and maintenance. The 240 of
I/M 240 refers to the 240 seconds in which data is collected.
Other emissions test systems are equally applicable. FIG. 1
shows a system which is used to test for certain emissions
during the I/M 240 test. The emission analysis system 10
uses a tube 22 to collect exhaust from the tailpipe of
automobile 12 to test for HC, CO, CO2, and NOx read by
analyzers 13, 14, 16 and 17 (or whatever emissions
collection is desired). Emission analysis system 10 also includes
other typical features such as flowmeters 18, calibration
gases 19, and an exhaust pipe to the roof 21.

As is true with all test regimes, the I/M 240 emissions
analyzer system is controlled by a software/hardware
combination and is in communication with the lane controller
processor 26. During the I/M 240 driving cycle, the I/M 240
emissions analyzer system transmits mass emissions data to
a processor at a once-per-second rate. Each grams-per-second
reading is time-stamped and transmitted to the
processor, which calculates the resultant second-by-second
grams-per-mile results. Each grams-per-second result also
includes a status byte that flags system failures,
out-of-range conditions, and communications errors so that the
processor can be signalled to take immediate action.

The processor 26 performs all of the described functions
and is usually in communication with a central data base
server host processor 28. Usually, the testing facility runs
several test lanes. In other situations, the test facility
operates a single emission analysis system 10. Each lane is
equipped with a processor 26 which supports the execution
of controller software 27 which manages the activities in the
lane including the storage of the emissions data. Then, using
the proper weighting factors, it calculates the total values,
which are compared to the appropriate I/M 240 (or other test
regimes) exhaust standards to determine pass or fail.

The processor 26 generates a failure report 31 indicating
that the vehicle has exceeded the legal limit of one or more
chemical emissions. FIG. 2 illustrates the contents of the
failure report 31 of FIG. 1, that is, the raw data generated
from the emission test, to give the vehicle owner
information about the levels of emissions with respect to the
allowable limits set by law. The failure report is delivered in
any format, for example, written or electronic.

In the system of this invention, the processor 26
provides a diagnostic assessment 32. In a first situation, the
diagnostic assessment is provided in the event of a failure of
a vehicle to pass an emissions test.

At the start of the vehicle inspection where a vehicle is
being retested, vehicle inspection personnel either enter
repair data 46 into processor 26 or it is scanned in and sent up
to the host data base server 28 for input using various other
means. Since the vehicle has failed a previous test, that is,
this current test is a retest, the performed repair data 46 is
surrendered and is entered at an appropriate time.

A repair data form scanning system that completely
automates the task of reading and evaluating the information
collected from vehicle repair reports is preferred. In a
situation where the inspection personnel enter the data, the
console display 36 provides prompts and messages to the
inspector and permits entry of responses and data.
Preferably, data entered into the system is thoroughly
checked for errors before being accepted.

Controller software 27 causes the emissions data or
similar data shown in FIGS. 2a-2f to be formatted in a
manner so that it can be compared to the classifiers stored in
classifier table 41. Comparator 42 runs an algorithm so that
processor 26 generates diagnostic assessment 32 for an
individual vehicle.

The algorithm to evaluate each vehicle using the classifier
table is preferably computationally economical. The
classifier is a set of data structures, one for each failure to be
diagnosed. In one embodiment of this invention, each failure
diagnosed is independent of the others since there may be
multiple failures for a single vehicle.

Each data structure is a series of rules that can be applied
to the vehicle population in the form of "if." Each vehicle
has one and only one applicable rule per data structure. The
algorithm then, for each data structure (or failure), compares
the vehicle's parameters with those in the first rule. If the
parameter values don't match, the algorithm goes to the next
rule. As soon as a matching rule is found, the probability that
corresponds with the parameters is provided and the data
structure is exited. Thus, through parsing, the failure
analysis feature of this invention matches emissions results to the
repair diagnosis (thus providing real-time analysis as
opposed to batch-calculations).

The classifier table is a data structure used in the
knowledge-based system and is made up of rules that can be
applied to make predictions. The rules represent leaf nodes
of a decision tree. Methods of induction of decision trees
from suitable empirical data have been studied by artificial
intelligence and statistical researchers since the late 1970's.
The tree generation is provided by a commercially available
program such as KnowledgeSEEKER(tm) by Angoss
Software which uses a CHAID (Chi-squared Automatic Interaction
Detection) algorithm, or by a variant of ID3 which was
devised by J. Quinlan, published in "Machine Learning,"
1986. Tree generation output, which is one element of the
update process, will be discussed in detail below. The
preparation of the raw data into input to CHAID is uniquely
determined by an initial analysis in the detailed design and
implementation phase and is described in the second section
of this detailed description of this invention.

The output file is modified to form the classifier as
described in detail below. The rule files are inputs to the
classifier formatting module (see FIGS. 3b and 3d below). In
other words, a classifier is a rule file that has been
reformatted and optimized to be useable by the failure analysis
module and the classifier table is a collection of one or more
classifiers.
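
The first-match rule evaluation described above can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions, not the implementation of this invention: the names
Rule, first_match and diagnose, the dictionary-based vehicle record,
and the catch-all rule are hypothetical. The idle HC and idle CO
thresholds and the 45% and 47% probabilities are borrowed from the
ignition example of FIGS. 3a and 3b in this description; whether the
range bounds are inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Vehicle = Dict[str, float]  # hypothetical record: parameter name -> value


@dataclass(frozen=True)
class Rule:
    """One leaf of a decision tree: a condition plus a failure probability."""
    matches: Callable[[Vehicle], bool]
    probability: float


def first_match(rules: List[Rule], vehicle: Vehicle) -> Optional[float]:
    """Walk the rules in order and exit at the first one that applies."""
    for rule in rules:
        if rule.matches(vehicle):
            return rule.probability
    return None  # only reached if the rules do not cover this vehicle


def diagnose(table: Dict[str, List[Rule]], vehicle: Vehicle) -> Dict[str, float]:
    """Evaluate each failure category independently of the others."""
    assessment = {}
    for failure, rules in table.items():
        probability = first_match(rules, vehicle)
        if probability is not None:
            assessment[failure] = probability
    return assessment


# Hypothetical classifier table with a single failure category.
classifier_table = {
    "ignition": [
        Rule(lambda v: 346 <= v["idle_hc"] <= 786 and v["idle_co"] > 1.27, 0.45),
        Rule(lambda v: True, 0.47),  # assumed catch-all: all-failed-vehicles rate
    ],
}

truck = {"idle_hc": 629, "idle_co": 8.49, "idle_o2": 9.7}
assessment = diagnose(classifier_table, truck)
print(assessment)
```

Because each category stops at its first matching rule, the cost per
vehicle is bounded by the number of rules, which is one way to read
the "computationally economical" requirement.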

Once a repair diagnosis has been made, the diagnosis 32 
may be ordered by listing the most likely cause first or by 
associating a probability with each one, depending on the 
source of the items and whether the probability data is 
desired. For example, the learning process, which is dis- 
cussed in detail below, will identify problems that frequently 
occur. This allows the probability to be calculated and 
included in the knowledge base and diagnosis. 

FIGS. 3a-d combined show an example of a repair
strategy report providing diagnostic assessment 32 (see FIG.
1) as output of the classifier table. For a 1984 Nissan truck
with an idle HC=629, idle CO=8.49 and idle O2=9.7, two
failure categories (ignition failure probability (FIG. 3a) and
air induction failure probability (FIG. 3c)) are generated
using characteristics of the vehicle and emissions data which
satisfy the classifier table.

Specifically, the air induction failure shown by the
combination of FIGS. 3c and 3d is satisfied by rule 10 below.
In processing, processor 26 matches vehicle make and model
year, vehicle mileage, and emissions composite values against
the rules of the classifier table. The algorithm processes the
rules in the classifier table 41 to pull out predictors that match
vehicle and test data and associated failure probabilities as
shown in FIGS. 3a-d. These figures were demonstrated
using data from a two-speed idle test.

Classifier format modules shown in FIGS. 3b and 3d
identify predictors for the failure probabilities shown in
FIGS. 3a and 3c. The graphs show the probability of a
problem in the repair category and how this vehicle
compares with other failed vehicles for the repair categories. The
rules above create the classifiers which form the classifier
format modules of FIGS. 3b and 3d.

In FIG. 3b, the predictor categories of HC and CO at idle
correlate with ignition failure. A value of idle HC which is
between 346 and 786 and an idle CO which is greater than
1.27 indicate a slightly reduced probability of ignition failure
compared with the average vehicle.

In FIG. 3a the ignition failure probability is shown with 
regard to all failing vehicles (47%) and with regard to this 
vehicle (45%). Similarly, in FIG. 3d, three predictor catego- 
ries are shown which present air induction failure symp- 
toms. In FIG. 3c the air induction failure probability is 
shown with regard to all failed vehicles (17%) and with 
regard to this vehicle (33%). 

The repair categories most likely to be responsible for the 
failure of the 1984 Nissan are presented in order of descend- 
ing probability. Alternatively, predictor percentage ranges 
may be mapped to English language descriptions, such as 
high, moderate and low. From FIG. 3c it can be seen that 
there is an elevated likelihood that the 1984 Nissan truck 
will have an air induction failure compared to that particular 
failure with respect to all failed vehicles which is extremely 
useful information for a repair technician in repairing the 
vehicle. 
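
The ordering and labelling of repair categories described above can
be sketched as follows. This is an illustration only: the function
names and the cut-off values for "high" and "moderate" are
hypothetical, since the description does not state where those
boundaries lie. The probabilities are the FIG. 3a and FIG. 3c values
quoted above.

```python
from typing import Dict, List, Tuple


def rank(assessment: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order repair categories by descending failure probability."""
    return sorted(assessment.items(), key=lambda item: item[1], reverse=True)


def label(probability: float, moderate: float = 0.25, high: float = 0.40) -> str:
    """Map a probability to a plain-language band (cut-offs are assumed)."""
    if probability >= high:
        return "high"
    if probability >= moderate:
        return "moderate"
    return "low"


# Values quoted for the 1984 Nissan truck and for all failed vehicles.
this_vehicle = {"air_induction": 0.33, "ignition": 0.45}
all_failed = {"air_induction": 0.17, "ignition": 0.47}

# Each entry: (category, probability, band, elevated versus all failed vehicles)
report = [
    (name, p, label(p), p > all_failed[name])
    for name, p in rank(this_vehicle)
]
for line in report:
    print(line)
```

Comparing each probability with the all-failed-vehicles rate is what
surfaces the air induction category as elevated even though ignition
has the larger absolute probability.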

Below is a listing of potential failure categories and
subcategories which reflect groups of repair actions that
exhibit similar symptoms. These are subject to change in
size and content depending on the learning process
performance discussed below. Subcategories are the lowest level of
information. The level of information provided as a
diagnostic assessment is dependent upon the correlations which
can be drawn during the learning process discussed below.
This is also constrained by the repair actions, the lowest
level of detail given on the vehicle emissions repair report.
The failure categories and the repair actions corresponding
to each category are for example:


fuel_delivery
  carburetor adjustment
  speed adjustment
  carburetor
  choke
  cold start
  fuel filter
  hoses
  injector cleaning
  injector(s)
  inlet restrictor
  pump
  regulator
  motor/valve/solenoid
  tank
air injection
  belt
  check valve
  control
  pump
  tubes
  valves
ignition
  cap/rotor
  coil
  distributor
  initial timing
  module
  plugs
  spark advance control
  wires
egr
  control system
  passage/hose
  sensor
  valve
evaporation
  carbon canister
  control
  filter
  hoses
  gas cap
  purge valve
catalytic converter
  converter
  heat shield
  preheat catalytic converter
air_induction
  air filter
  ducts
  sensor
  thermostatic air door
  throttle bore
oil change HI CO
  could put in oil & coolant level
  diluted oil
pcv
  crankcase ventilation
  hose
  passage
  valve
electronic_control
  air control
  canister purge control
  coolant sensor
  ECM
  EGR control
  idle control
  MAP sensor/switch
  mass air flow sensor
  mixture control
  pressure sensor
  PROM
  RPM sensor/switch
  spark control
  temp sensor/switch
  throttle position sensor/switch
  vehicle speed sensor
O2 sensor
  O2 sensor
exhaust
  exhaust components
  manifolds
vacuum_leak
  vacuum leak
engine_mech
  valve
  valve timing


While each category listed above includes subcategories,
these subcategories can include subcategories of their own.
As the classifier table becomes more accurate, more
subcategories can be addressed by the rules independently as a
category.

Standardization of repair information and consistency is
preferable. To provide consistency, where the repair
technician is equipped with appropriate computer hardware and
software, diagnostic assessments are presented
electronically, for example via dial-up phone line or by
Internet data delivery. The diagnostic assessments are also
provided on a printed failure report at the inspection lane
which the vehicle owner presents to the repair technician.

As discussed above, the classifier table 41 has been
previously built and stored for access during processing by
comparator 42. Accordingly, the classifier table 41 provides
the ability of this invention to "close the loop" between the
repair mechanic and the inspection system by providing
increasingly accurate diagnostic and repair statistics to
increase the success rate of the repair process, bringing more
vehicles into compliance under waiver limits.

Above, mainly the failure diagnostic feature of this
invention has been described. That is, this detailed description up
to this point has been directed to the explanation of the
emissions testing process in general with reference to the
initial diagnostic assessments feature of this invention.
Below, the learning process or update feature of the
invention is described in detail. Accordingly, the following section
describes the learning process feature of this invention
which uses emissions testing information generated from
failed tests and passing retests to update the predetermined
criteria used to make initial diagnostic assessments. That is,
failed emissions test data is used, and passed retest data is
used only to validate that repairs performed were successful.
The retest emissions results might include information such
as that found in FIGS. 2a-2f.

The overall process of this invention, including the update
feature, is shown in FIGS. 4a and 4b, where
FIG. 4a shows the system and FIG. 4b provides a legend for
the path configurations. The vehicle 12 visits the inspection
station 20 and receives a failure report with diagnostic
assessment 31. The vehicle visits the repair facility 25 and
receives repairs, most likely including those
suggested by the failure report 31 as discussed in detail
above. The repair facility 25 generates a repair report 46 and
the inspection station 20 retests the vehicle 12. That retest
information 51 is sent to the host 28 along with vehicle
emissions repair reports 52 to be gathered as part of host
databases 53. The learning process 60 performs as described
below and updated classifier data files are transferred 61 to
the inspection station and processor 26.

By capturing information regarding repairs 46 performed
on vehicles that fail emissions inspections, and then
retesting the vehicle by emission analysis system 10, information
is provided to processor 26 which is collected and used to
update the classifier table during the learning process.
Performed repair data 46 is input to the host 28 so that it
corresponds unambiguously with the vehicle test results
record 31 and diagnostic assessments 32.

As mentioned above, the learning process may be
initiated on the host or other centralized apparatus. Alternatively,
the learning process operates in a client-server mode with the
learning process connected directly to the host database
tables and a client application running on the PC. In this
configuration, these functions would be implemented in a
client application and the output could be any file format
acceptable to a tree building algorithm.

The user interface that initiates the learning process 
preferably requires the following information from the user: 
data collection start date; vehicle test type desired for 
programs that have multiple test regimes, i.e. two-speed idle, 
I/M 240; and the value to be used for excluding marginal 
failures, fail_margin. 

Suitable data are selected, files are assembled and written 
out to a file for vehicle records meeting the learning process 
criteria. There are several separate types of functions per- 
formed including: creating reports that monitor the effec- 
tiveness of the learning process and the diagnostic assess- 
ments issued; filtering vehicle records for learning; 
assembling a data record in a temporary table for acceptable 
vehicles including formatting and checking failed values; 
copying the contents of the temporary table data to an input 
file for the learning process; creating additional data files for 
use in the lane diagnostic subsystem. 

Before actually discussing the learning process itself, the 
preliminary reports are discussed in that they are generated 
through the process of preparing the learning process data 
for the learning process operation. 

Turning to FIG. 5, there is shown a systematic diagram of
how the host diagnostic system performs updates of the
classifier table's knowledge base. The update is an ongoing
process of "learning" with the following steps: the statistical
module receives new data 53, including actual repair data
from retested vehicles and data from other testing programs
in the form of individual records (as discussed above); the
data is filtered for errors and weighted 66 according to its
value, or its application is ordered, so that more credible
measures have a greater influence in forming the diagnosis;
the data is formatted and compressed 67 so that it is in a form
which can be correlated; the actual repairs are correlated with
the predictors to create rules 76; the rules are compressed and
concatenated 69 to provide data structures for individual
failures and provide compaction of the data structure; the
compacted classifiers are tested to determine accuracy 79; and
the knowledge base is updated for distribution to all locations
where it resides. The frequency of the updates is adjustable.
The determination of which data to use and how to format it is
nontrivial. In one embodiment, the OBD data is included in
the learning process. In a different embodiment, the
vehicle's OBD overrides some or all probabilistic predictions.
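
As a rough illustration of the step that correlates actual repairs
with predictors, the sketch below computes the Pearson chi-squared
statistic on which CHAID-style tree builders are based. It is a
simplified stand-in, not the procedure of KnowledgeSEEKER or of this
invention: a full CHAID implementation also merges categories and
compares adjusted significance levels, which is omitted here. The
function names, the predictor names and all counts are hypothetical.
Rows are bands of a predictor; columns are counts of vehicles whose
successful repair was, or was not, in a given failure category.

```python
from typing import Dict, List

Table = List[List[int]]  # contingency table of observed counts


def chi_square(table: Table) -> float:
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0:
        return 0.0
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


def strongest_predictor(tables: Dict[str, Table]) -> str:
    """Pick the predictor whose table departs most from independence.

    Comparing raw statistics is only meaningful for tables of the same
    shape; that restriction is an assumption of this sketch.
    """
    return max(tables, key=lambda name: chi_square(tables[name]))


# Hypothetical counts for two candidate predictors.
candidate_tables = {
    "idle_co_band": [[10, 20], [30, 40]],
    "odometer_band": [[25, 25], [25, 25]],
}
print(strongest_predictor(candidate_tables))
```

A predictor whose bands show no association with the repair category
scores zero, so it would not be chosen as a split.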

Each element of the update feature as outlined above is
now discussed in more detail. Turning first to the statistical
module 53, the statistics given here are descriptive in nature
and are formatted and output in the repair effectiveness
report in the form of an administrative report 54. The values
are preferably computed for the data collection period input
by the user to cover the learning process. These vary by
emissions testing program and may include the following:
number of failing vehicles broken down by type of failure
(standard failed) and test regime applied; number of failure
reports 31; description of OBD operations performed and
results, including retest success/failure rates; frequency
distribution of vehicle emissions repair report 46; frequency
distribution of vehicle emissions repair report 46 failure
categories for all failed/repaired vehicles; and frequency
distribution of multiple retest repair actions (hard-to-diagnose
repairs) by subsequent retest result (repairs made followed
by failing retest and hard-to-fix repairs made followed by
passing retest).

Again, certain reports are generated relating to the input
data as the data is prepared for the learning process 68. For
example, a repair effectiveness report in the form of an
administrative report 56 describes the repair actions performed
on failed vehicles. A repair effectiveness report is used
primarily to help understand the distribution of repair actions in
the failed vehicle population and to identify those repairs that
are difficult to diagnose, marginally effective, and
ineffective. The inputs to the report are the descriptive statistics
from the statistics module and could include input from the
failure frequency distribution file 56 which uses effective,
successful repairs only. The performance of the learning
process 68 is best given by the learning process log 83
(discussed below).

The filtering function of the update or learning process 
feature of this invention, the filter module 66 filters vehicle 
records through selection criteria. Rows from the repaid 3 
data table meeting the selection criteria are put in a tempo- 
rary table. The following selection criteria apply: 

1) The filter module selects vehicle records based on a 
data collection start data calculated or input by the user. 
The default start date should precede the date of the last 
learning process by a couple of weeks to allow selec- 
tion of vehicles that were excluded previous due to an 
incomplete test/repair/retest cycle. 

(2) The filter module selects only vehicle with regular, 
documented test/repair/retest cycles consisting of the 
failed emissions test record and the vehicle's next 
consecutive passing retest with full repair information. 
The following types of data are filtered out: waived 
vehicles; test records other than the last two (failed 
followed by pass result) for vehicles with multiple 
repair/test cycles; aborted inspections; vehicles that 
failed on tampering or purge/pressure only; vehicles 
without repair or retest records, 

(3) The filter module compares initial test results with 
emission standards. Select only vehicles that failed the 
initial inspection with at least one emissions compo- 
nent that exceeded the standard by at least fail__margin 
% and subsequently passed on the next retest. Note that 
mis is the only occasion where values from the passed 
retest are examined. 

(4) The filter module selects at least row_min (i.e.
initially 8000) rows. The process stops and issues a
message if fewer than row_min rows are available.
Preferably, values used are empirically determined.

(5) The filter module checks that the temporary table
contains rows that are unordered with regard to the
symptoms and the population of the available records.
Care is taken that population of the table or subsequent
copying of the table is not done using an index or key
that creates such an order, even indirectly. If 15000
rows are available and only the first 8000 are chosen,
the population used is then skewed with respect to the
ordering variable. For example, VIN would skew the
data with regard to vehicle make via the international
manufacturer code; similarly, selection by plate might
skew the data with regard to model year. Ordering, for
example, by test date would not present these problems.
(6) The filter module selects vehicle rows according to the 
test type under consideration. Preferably, all vehicles 
undergo the same emissions test regime. 
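
The filter steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Cycle record layout, the function names, and the use of a seeded shuffle to guarantee an unordered table are hypothetical and are not taken from the described system.

```python
import random
from dataclasses import dataclass


@dataclass
class Cycle:
    """One failed test plus its next consecutive retest (hypothetical layout)."""
    vin: str
    initial: dict          # emissions component -> value on the failed test
    standard: dict         # emissions component -> applicable standard
    retest_passed: bool
    has_repair_info: bool
    waived: bool = False
    aborted: bool = False


def exceeds_margin(cycle, fail_margin_pct):
    """True if any component exceeded its standard by fail_margin percent."""
    factor = 1.0 + fail_margin_pct / 100.0
    return any(cycle.initial[c] >= cycle.standard[c] * factor
               for c in cycle.standard)


def filter_cycles(cycles, fail_margin_pct, row_min, seed=0):
    """Apply steps (2) to (5): select usable cycles, check row_min, unorder."""
    kept = [c for c in cycles
            if not c.waived and not c.aborted
            and c.has_repair_info and c.retest_passed
            and exceeds_margin(c, fail_margin_pct)]
    if len(kept) < row_min:
        # Step (4): stop and issue a message.
        raise RuntimeError(f"only {len(kept)} rows available, need {row_min}")
    # Assumption: shuffling is one way to avoid ordering by VIN or plate.
    random.Random(seed).shuffle(kept)
    return kept
```

Shuffling before truncating to row_min rows is one way to avoid the skew described in step (5); ordering by test date, as the text notes, would be another.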
With regard to the next update feature element, the
formatting and weight module 67, the inputs are rows from
the filter module 66. The result is a temporary table,
learning_process, filled with suitably formatted learning
process data. This module creates a row in the
learning_process table for each row meeting the filtering
criteria. Below, formatting standards to be applied to column
values are discussed. The output from the formatting and
weight module 67 is two input files used by the learning
process 68 and two data files used in the inspection lane 20.

The columns in the learning_process table typically
contain three different kinds of data. A first type of data is a
vehicle description. The following columns and their values
describe the vehicle characteristics and are taken directly
from the repair_data table:

make
model_year
vehicle_type
cylinders
cc_displacement
odometer

These columns are the same for all test types. 

A second type of learning_process data is initial test 
results. These columns represent the vehicle's emissions test 
results and depend on the specific test regime. The following 
three blocks of data are under evaluation for the I/M 240 test 
regime. At least one block will be used. All data is from the 
initial test (before). 

BLOCK1 (SOURCE: repair_data table)
before_hc_phase2
before_co_phase2
before_co2_phase2
before_nox_phase2

BLOCK2 (SOURCE: repair_data table)
before_hc_composite
before_co_composite
before_co2_composite
before_nox_composite

BLOCK3 (SOURCE: I/M 240 second by second)
hc_accel
hc_cruise
hc_decel
hc_transient
co_accel
co_cruise
co_decel
co_transient
co2_accel
co2_cruise
co2_decel
co2_transient
nox_accel
nox_cruise
nox_decel
nox_transient


Blocks 1 and 2 values are immediately available from the
repair_data table. Block 3 values are derived from I/M 240
second-by-second data by summing the emissions component
values over four different driving modes. These modes
are acceleration (accel), cruise, deceleration (decel), and
transient. The I/M 240 accel, cruise, decel and transient
modes are identified in Table 1.

TABLE 1

I/M 240 Mode Definition

Mode Number   Mode        From (seconds) s0   To (seconds) s1
1             Accel       0                   15
2             Transient   15                  54
3             Cruise      54                  79
4             Decel       79                  93
5             Accel       93                  106
6             Transient   106                 156
7             Accel       156                 187
8             Cruise      187                 200
9             Decel       200                 239

Values for Block 3 columns are the sums of the stored
emissions component values over all the relevant time steps
in the mode. For example, the hc_accel value for a given
vehicle is the sum of hc values recorded over the
acceleration modes 1, 5 and 7, or

$$hc\_accel = \sum_{m \in \{1,5,7\}} \; \sum_{s=s_0(m)}^{s_1(m)} hc(s)$$

where hc(s) is the hc second-by-second value at time s, and
s0 and s1 are the start and end times of each mode. One
may also want to throw out mode 1 and evaluate the
potential of the transient modes. The mode definitions are
given in Table 1.

Two-speed idle testing results in two values for each
emission component HC, CO, and CO2: one at curb idle and
one at 2500 rpm. In addition, the engine rpm value at idle is
used.

The third type of data is failure categories. Additional
fields record failure categories for repairs made to the
vehicle. Repair actions from the repair_data table are
mapped to failure categories. Categories reflect groups of
repair actions that exhibit similar symptoms. This mapping
is recorded in the Repair Action/Failure Category file. A
copy of this file is transmitted to each inspection lane 20
where it is kept on the lane controller computer. Failure
categories are subject to change, depending on the learning
process performance. Preferably, changes are documented in
the learning process log 83 as well as in the documentation
for this system and the Repair Action/Failure Category file
used in the inspection lane 20.

The failure categories and the repair actions corresponding
to each category are listed above. These columns are of
datatype bit with values either 1, signalling that one or more
repair actions were made in the category, or 0, signifying that
no actions were taken. Actions that were recommended but
not performed are not included.

Another characteristic of the data is value standardization.
That is, table values are checked as follows:

1. Numerical fields do not contain unreasonable values
that would skew the analysis results. Acceptable
bounds for these depend on the field. Records with
nonsensical or outlying field values may be deleted
from the analysis or have the field reset to the nearest
acceptable value, respectively.

2. Categorical fields, such as Make, match a valid Make
value. Because learning process data or classifiers may
be shared across programs, the categorical fields
remain consistent regardless of agency requirements.

3. Optional fields should default to NULL.

The output files as described above, that is, those that have
been filtered, formatted and weighted, are input to a decision
tree system using an algorithm such as CHAID 76 described
above. The final step before input to CHAID is creating
output data files for use in the tree building step.

Four output files are created. The Repair Action/Failure
Category file documents the specific repair actions and
their respective failure categories used in the learning
process. The repair actions are shown on the vehicle
emissions repair report. The failure frequency distribution
file 56 documents the relative frequency of a failure category
in the failure population. This file contains two values for
each failure category: the name of the failure category and
the percentage of the vehicles in the temporary table
exhibiting the failure.

Two output files are created which contain data from the
temporary file. Half of the rows should be placed in a file
with extension .tra to be used for training purposes 73. The
other half are put into a file with extension .tst to be used for
testing 74. The input files for the CHAID algorithm 76 are
delimited format text files. Assume a comma delimiter. The
creation of these files depends on the relational database
used. In Sybase, for example, the bulk copy utility can be
used. Alternatively, the algorithm could be encoded as a
relational database procedure.

Once training 73 and testing 74 files are available on the
host, they can be transferred to the learning process 68. The
CHAID algorithm 76 input files are directly imported as
ASCII files with an optional import format file. One import
format file is required for each failure category. The format
files have the same name as the failure category. The format
file specifies the delimiters, the field names, and the status
of each field. The status options and delimiters are:

E   Erase - Ignore data in this field
D   Dependent variable - the failure category to be analyzed
    and predicted
I   Independent variable - the fields containing variables
    that may be predictors

Each format file has the failure category under consideration
marked with a D for dependent variable. All other
failure category fields are marked with E options (they are
not under consideration and no a priori knowledge is
available). All other fields are marked with the I option.

The import files and structure of the CHAID algorithm 76
runs are described here. Another implementation of this
algorithm or a similar algorithm would have similar methods
for identifying the independent and dependent variables and
for identifying the variable types, i.e. categorical (ordered or
unordered) or continuous.

The failure prediction problem and its solution are
formulated to assume that failures are independent of each
other due to the high number of multiple failures and the
lack of a priori knowledge about the existence of other
failures. Each time the tree building is run, the presence or
absence of a failure in a single failure category is examined.
A tree structure is created which represents statistically
significant relationships between predictor category values
and the failure category under consideration. The existence
of other kinds of failures is suppressed through the import
format file.

The CHAID algorithm 76 runs are made using the import
format files 77 and the training data files 73 (file extension
.tra) as described above. The output from CHAID for each
run is a single rule file 78. A rule file contains a variable
number of production rules that describe the leaf nodes of
the tree structure. Each rule describes a subset of the general
vehicle population for which the likelihood of the given
failure is significantly different from the general population.
A rule is a conditional statement of the probability of the
particular failure for a well-defined group of vehicles. The
conditions that define the group of vehicles are values for
one or more of the predictor (independent) variables. Only
predictors that significantly correlate with the given failure
are present in the rule file.
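
As a rough illustration of the data preparation that precedes each tree-building run, the sketch below assigns the E, D and I statuses for one failure category and splits the rows in half for the .tra and .tst files. The function names and the example field and category names in the usage note are hypothetical; the actual on-disk syntax of an import format file is not given in this description and is not modeled here.

```python
def field_statuses(fields, failure_categories, target):
    """Assign a status to every field for one tree-building run.

    D marks the failure category being predicted, E erases every other
    failure category, and I marks all remaining fields as predictors.
    """
    if target not in failure_categories:
        raise ValueError(f"{target} is not a failure category")
    statuses = {}
    for name in fields:
        if name == target:
            statuses[name] = "D"
        elif name in failure_categories:
            statuses[name] = "E"
        else:
            statuses[name] = "I"
    return statuses


def split_rows(rows):
    """Put the first half of the rows in the training set, the rest in test.

    Assumes the rows are already unordered, as the filter module requires.
    """
    half = len(rows) // 2
    return rows[:half], rows[half:]
```

For example, with fields make, model_year, odometer, AIR_INJ and EGR (EGR being a made-up second category), a run for AIR_INJ would mark AIR_INJ as D, EGR as E, and the three vehicle fields as I.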

An excerpt of a rule file for the AIR_INJ failure category 
is given below (note that these are not the same category 
names as given above). Although each rule has at least one 
IF condition, a rule may have one or two probability 
statements. 


RULE_1 IF
  MAKE = AC, ACURA, ALFA, AM GE, AUSTI, HYUND, LOTUS,
    MITSU, OPEL, PANTE, SAAB, SUBAR, SUNBE, SUZUK or VOLKS
  HIGH_O2 = [-5.4,8.5)
THEN
  AIR_INJ = 0 99.0%
  AIR_INJ = 1 1.0%

RULE_2 IF
  MAKE = AC, ACURA, ALFA, AM GE, AUSTI, HYUND, LOTUS,
    MITSU, OPEL, PANTE, SAAB, SUBAR, SUNBE, SUZUK or VOLKS
  HIGH_O2 = [8.5,29.6]
THEN
  AIR_INJ = 0 90.9%
  AIR_INJ = 1 9.1%

RULE_3 IF
  MAKE = AMC, BUICK, CHEVR, FIAT, FORD, GMC, ISUZU, JEEP,
    MAZDA, OLDSM, or TOYOT
  MODEL_YR = [55,74)
  HIGH_O2 = [-5.4,0.2)
THEN
  AIR_INJ = 0 72.7%
  AIR_INJ = 1 27.3%

RULE_4 IF
  MAKE = AMC, BUICK, CHEVR, FIAT, FORD, GMC, ISUZU, JEEP,
    MAZDA, OLDSM, or TOYOT
  MODEL_YR = [55,74)
  HIGH_O2 = [0.2,2.4)
  ODOMETER = [0,115496)
THEN
  AIR_INJ = 0 95.3%
  AIR_INJ = 1 4.7%

The rules are mutually exclusive and cover the entire
training set, i.e., exactly one rule applies for each vehicle.
CHAID parameters include the following:

1. Automated runs are made using Cluster method.

2. The filter menu value should be set to Prediction (the
default), corresponding to a 5% maximum statistical
error level.

3. The significance value should be set to adjusted (the
default).

4. Suggested Bonferroni adjustment setting is 3 based
on relationship between input parameters odometer
and model_year.

CHAID rule files 78 are not suitable for use in the learning
process. The classifier formatting process takes the rule file
and encodes the text-based rules into fields separated by
opcodes that enable execution of the rules. It also extends
the ranges of the values to ensure that the rule set is onto the
vehicle population.

Classifier formatting consists of two passes. Each of these
is discussed in the following sections.

During the first pass, a recursive descent parser reads in
the rule file generated by CHAID and builds the encoded
classifier file. The layout of the classifier file is depicted in
Table 2 below. Each rule layout starts with the Rule_Name
and ends with End_of_Rule.

TABLE 2

Rule Layout

Rule_Name
Condition_1
Condition_2
...
Condition_n
End_of_Condition
Failure Category
Probability Failure Exists
End_of_Rule

Each rule may contain one or more conditions. Each
condition consists of a predictor name and a range or list of
predictor values. Predictors may be continuously-valued,
such as those having real number values. They may also be
categorical, such as the make predictor category. The last
condition is denoted by the End_of_Condition field. If the
condition is a Make_Condition, where Make represents any
categorical predictor, then the last categorical value is
followed by the End_of_Make field. Table 3 below depicts the
layout of a condition. A Range_Condition always contains
the Range_Condition opcode followed by four data fields.
The bound types can be inclusive or exclusive. A
Make_Condition contains the Make_Condition opcode followed
by one or more categorical field values. The End_of_Make
field signifies the end of the categorical field values.

TABLE 3

Condition Layout

Make_Condition    Make_1       Make_2       ...          Make_n       End_of_Make
Range_Condition   Lower Bound  Lower Bound  Upper Bound  Upper Bound
                  Type         Value        Type         Value


During the second pass, the range adjuster reads in the
classifier file generated by the first pass and adjusts the
upper and lower bounds of the conditions of the rules such
that no vehicle in the population is ever out of bounds of all
rules.
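
The two formatting passes can be sketched as follows. This is an illustration under stated assumptions, not the described implementation: the tuple representation of a condition, the placement of the predictor name directly after the opcode, and the strategy of widening the outermost bounds of each predictor to infinity are all hypothetical choices. Parsing of the CHAID text rule file is omitted.

```python
import math


def encode_rule(name, conditions, category, probability):
    """First pass: flatten one rule into the field order of Table 2.

    conditions holds tuples of the form
      ("range", predictor, lo, lo_inclusive, hi, hi_inclusive) or
      ("make", predictor, [values, ...]).
    """
    fields = [name]
    for cond in conditions:
        if cond[0] == "range":
            _, predictor, lo, lo_inc, hi, hi_inc = cond
            fields += ["Range_Condition", predictor,
                       "inclusive" if lo_inc else "exclusive", lo,
                       "inclusive" if hi_inc else "exclusive", hi]
        else:
            _, predictor, values = cond
            fields += ["Make_Condition", predictor, *values, "End_of_Make"]
    fields += ["End_of_Condition", category, probability, "End_of_Rule"]
    return fields


def extend_ranges(rules):
    """Second pass: widen the outermost bounds of every predictor.

    rules is a list of condition lists. The lowest lower bound seen for a
    predictor becomes -inf and the highest upper bound becomes +inf, so a
    value outside the training range still falls inside some rule.
    """
    lows, highs = {}, {}
    for rule in rules:
        for cond in rule:
            if cond[0] == "range":
                _, p, lo, _, hi, _ = cond
                lows[p] = min(lows.get(p, lo), lo)
                highs[p] = max(highs.get(p, hi), hi)
    adjusted = []
    for rule in rules:
        new_rule = []
        for cond in rule:
            if cond[0] == "range":
                kind, p, lo, lo_inc, hi, hi_inc = cond
                if lo == lows[p]:
                    lo, lo_inc = -math.inf, True
                if hi == highs[p]:
                    hi, hi_inc = math.inf, True
                cond = (kind, p, lo, lo_inc, hi, hi_inc)
            new_rule.append(cond)
        adjusted.append(new_rule)
    return adjusted
```

Interior bounds are left untouched, so the rules stay mutually exclusive while covering values that never appeared in the training set.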

The inputs to the Failure Analysis Accuracy Module 79
are the test data file 74, the new classifier 81 for one or more
failure categories, and the "Old Classifier" 82 for the failure
categories. The function of the module 79 is to compare the
accuracy of the new and old classifiers by applying both to
a data set where the actual repairs are known. The output of
the module 79 is a set of accuracy statistics and a set of
classifiers that are the best classifiers for their respective
failure categories.

The algorithm used for applying the classifier to the
test data set for each category is as follows:

1. For each record in the test data set, obtain a prediction
(probability value P) using the classifier lookup function
in the accuracy module.

2. Assign the actual value of the repair field to A (recall
A is either 0 for no repair performed or 1). Multiply A
by 100 to convert to percentage so that it agrees with
units from classifier value. Then the variance for the
test set containing n records is:

$$\sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (P_i - A_i)^2$$

3. Compare the variance for the old and new classifiers.
The best classifier is the one with the lowest variance,
i.e., agrees best with actual values. This step is based on
a one-tailed F test for large n.

A copy of the best classifier 71 for each failure category
is saved on the learning process 86 to supply the "Old
Classifier" for the next learning process cycle. The best
classifiers for each failure category are combined into one
large classifier table 41 for transmission to the inspection
lane.

The failure analysis accuracy module 79 takes a single
instance of a vehicle record and, using a classifier table file,
produces a failure probability for one or more failure
categories. The functions included are:

1. Only in the learning process, formatting of
vehicle characteristics and test results to agree with
predictor standards for values and units (this is already
done for the test dataset); and

2. In both the learning process and on the lane, the
selection or verification of the proper classifier table file
and classifier table file lookup.

These functions are performed on the learning process 68
to support the accuracy calculation, and in the inspection
lane 20. The formatting function is not required for the
accuracy calculation, since the test dataset used is already
formatted.

As is done in the emissions inspection lane 20 (see FIG.
4), the classifier lookup function in the accuracy module
compares formatted data to rules in a classifier table. Here,
the lookup operation applies the rules to the formatted
vehicle record which results in a probability value P for each
classifier or failure category in the classifier table file.

For each failure category, compare the predictor values
for the vehicle with the first condition of the first rule. If the
first condition is met by the vehicle predictor value, compare
the predictor value for the second condition. If any condition
is not met, go on to the next rule. If all conditions of
a rule are met, assign P to be the value in the rule and skip
all remaining rules for that failure category.

The Learning Process Log 83 is a record of all learning
process activity. An entry is written to the log for each
learning process performed. An entry contains:

Processing Date: date that learning process is performed
Data collection dates: range of test dates for data used
Test Type: 2-speed idle, loaded, I/M 240
Fail-margin value used
Source of Training Data: program or zone name and
  range of emissions test dates
Number of Training Records
Source of Test Data: program or zone name and range of
  emissions test dates
Number of Test Records
Changes made to Repair Action/Failure Category
Variance of old and new classifiers for each category
Notes on distribution to stations

The pattern report 84 consists of either the graphical tree
or the generic rule file as it is output from the
CHAID algorithm 76 (not the reformatted version) as
desired by the receiving parties. One tree or rule file is
generated for each failure category which contributes to the
new failure prediction classifier 81. A new tree or rule file
need not be generated if it was found to be less accurate than
the previous version.

Three types of data files are routinely transferred from the
host to the lane processor 26 (see FIG. 1) at each
inspection station: the classifier table files on the learning
process 68 and the frequency distribution file 56 and repair
action/failure category file 57 from the host computer 28.
There are at least two methods for doing this transfer, that is,
floppy and network methods.

The network method is as follows. The files containing
the classifier tables are transferred via ftp from the learning
process 68 to the host. Host files are transferred from the
host to the lane 20 via established network communications.

The second method of the file transfer is via floppy disk.
The data processing manager at the host in such a case
would copy the file to a floppy disk and transport that disk
to the station housing the lane processor 26. The floppy
method is a backup method in case the network is down.

The tree construction of FIG. 6 is an example that was
created using California Smog Check data obtained in 1991.
The predictors (independent variables) are slightly different
from a typical two-speed idle test, since O2 was measured
in addition to HC, CO, and CO2. The diagnostic report
shows two failure categories.

Tree output from CHAID shown in FIG. 6 is a tree
representation of relationships found during analysis of
air induction failures. A set of samples was analyzed. As
shown in the root node, 16.0% of vehicles were brought into
emissions compliance after air induction repair actions.
The remainder of the vehicles were brought into compliance
after repair actions not related to air induction repair
actions. Significant correlations were found between air
induction-related failure and 5 of the predictor categories
(independent variables): MODEL_YEAR, O2_IDLE, HC_2500,
HC_IDLE, and CO2_2500. Each of the terminal or leaf nodes
represents a subset of the original set, characterized by a
unique set of predictor values. These values were used to
provide the sample repair diagnosis of FIGS. 3a-d.

The detail in the node shows the percentage of vehicles
without air induction-related failure, the percentage with air
induction-related failure and the number of vehicles in the
node. For example, for vehicles having a model year
between 1976 and 1991, O2 at idle between 4.9 and 29.6,
and HC at idle greater than 346, the incidence of air
induction failure was 32.5%. In this subset of the population,
the incidence of this failure is nearly twice that in the entire
sample (16.0%) and more than three times the rate of
the failure in older vehicles with low O2 at idle (9.5%).
The tree output is issued as a pattern report.

The generic rule output is a set of rules that represent the
leaf nodes of the tree. The tree shown in the example
produces the following rule output:

RULE_1 IF
  MODEL_YEAR = [55,76)
  O2_IDLE = [-5.2,4.9)
THEN
  AIR_INDUCTION = 0 90.5%
  AIR_INDUCTION = 1 9.5%

RULE_2 IF
  MODEL_YEAR = [55,76)
  O2_IDLE = [4.9,6.7)
  HC_2500 = [-2,99)
THEN
  AIR_INDUCTION = 0 55.6%
  AIR_INDUCTION = 1 44.4%

RULE_3 IF
  MODEL_YEAR = [55,76)
  O2_IDLE = [4.9,6.7)
  HC_2500 = [99,3276]
THEN
  AIR_INDUCTION = 0 86.3%
  AIR_INDUCTION = 1 13.7%

RULE_4 IF
  MODEL_YEAR = [55,76)
  O2_IDLE = [6.7,29.6]
THEN
  AIR_INDUCTION = 0 93.3%
  AIR_INDUCTION = 1 6.7%

RULE_5 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [-5.2,0.8)
  HC_IDLE = [-12,76)
THEN
  AIR_INDUCTION = 0 64.6%
  AIR_INDUCTION = 1 35.4%

RULE_6 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [-5.2,0.8)
  HC_2500 = [76,7505]
THEN
  AIR_INDUCTION = 0 81.7%
  AIR_INDUCTION = 1 18.3%

RULE_7 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [0.8,4.9)
  CO2_2500 = [2.3,15.8)
THEN
  AIR_INDUCTION = 0 85.4%
  AIR_INDUCTION = 1 14.6%

RULE_8 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [0.8,4.9)
  CO2_2500 = [15.8,27.7]
THEN
  AIR_INDUCTION = 0 57.9%
  AIR_INDUCTION = 1 42.1%

RULE_9 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [4.9,29.6]
  HC_IDLE = [-12,346)
THEN
  AIR_INDUCTION = 0 81.4%
  AIR_INDUCTION = 1 18.6%

RULE_10 IF
  MODEL_YEAR = [76,91]
  O2_IDLE = [4.9,29.6]
  HC_IDLE = [346,7505]
THEN
  AIR_INDUCTION = 0 67.5%
  AIR_INDUCTION = 1 32.5%
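
To make the lookup and accuracy steps concrete, the sketch below applies three of the rules above (rules 1, 9 and 10) to a vehicle record and scores the predictions against known repairs. It is a minimal illustration: the dictionary layout and function names are hypothetical, the upper model-year bound is taken as 91 in line with the 1976 to 1991 example in the text, and the n - 1 divisor in the variance is an assumption.

```python
def in_range(x, lo, hi, hi_inclusive):
    """Half-open [lo, hi) test, or closed [lo, hi] when hi_inclusive is set."""
    return lo <= x < hi or (hi_inclusive and x == hi)


# Each entry: (conditions, probability in percent that AIR_INDUCTION = 1).
# Conditions map a predictor name to (lower, upper, upper_inclusive).
RULES = [
    ({"MODEL_YEAR": (55, 76, False), "O2_IDLE": (-5.2, 4.9, False)}, 9.5),
    ({"MODEL_YEAR": (76, 91, True), "O2_IDLE": (4.9, 29.6, True),
      "HC_IDLE": (-12, 346, False)}, 18.6),
    ({"MODEL_YEAR": (76, 91, True), "O2_IDLE": (4.9, 29.6, True),
      "HC_IDLE": (346, 7505, True)}, 32.5),
]


def lookup(vehicle, rules):
    """Return P from the first rule whose conditions are all met, else None."""
    for conditions, p in rules:
        if all(in_range(vehicle[name], *bounds)
               for name, bounds in conditions.items()):
            return p
    return None


def variance(records, rules):
    """Score a classifier on records of (vehicle, A), with A equal to 0 or 1.

    A is multiplied by 100 so that it is in the same units as P.
    Assumes every vehicle matches a rule and that there are at least
    two records; the n - 1 divisor is an assumption.
    """
    squared = [(lookup(v, rules) - 100 * a) ** 2 for v, a in records]
    return sum(squared) / (len(squared) - 1)
```

Running variance once with the old classifier and once with the new one, on the same test records, gives the two figures that step 3 of the accuracy algorithm compares.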

As mentioned previously, the rule files are inputs to the
classifier formatting module shown in FIGS. 3b and 3c.

In summary, the learning process periodically analyzes 
the data and creates a knowledge base in the form of a new 
or revised classifier, which together with other classifiers 
form a classifier table. The learning process learns about 
vehicle emissions failures by studying examples of success- 
ful repairs. 

The analysis judges the effectiveness of repairs performed
and looks for significant correlations between vehicle or test
characteristics and successful repair actions. This knowledge
is applied to predict the most likely causes of failure for
a specific vehicle based on its characteristics and emissions
test results including I/M 240 transient data. The learning
process takes into account a variety of characteristics
including vehicle make, model, year, engine size, mileage,
and emissions test results including transient test data based
on the modal analysis of the second-by-second data. The
learning process produces the best indicators of failure from
this multitude of characteristics and produces only useful
conclusions. Thus, this invention allows a large number of input
parameters to be evaluated and the algorithm determines
which ones correlate with failures. The best predictors for
each failure are used and include several that have not been
used before.

By using the information on the effectiveness of actual
repairs performed on vehicles and the results of retests, the
learning process learns from successful repairs. By using a
large number of successful cases and looking for statistically
significant correlations between symptoms and repairs, the
learning process screens out the effects of low-quality
(usually random) repair actions.

Preferably, the host will issue periodic administrative 
reports 54 and learning process log entries 83 sufficient to 
monitor the effectiveness of the diagnostic system and 
pattern reports to document vehicle test and repair trends 
found in the data. The frequency of report generation and 
learning process will be adjusted. 

The administrative reports 54 and learning process log 83
include the number of diagnostic reports being issued, the
principal measures used to generate them (e.g. OBD used),
and their accuracy as documented by repair information
gained on their reinspection. Analysis of diagnostic accuracy
will show the distribution (e.g. make/model/year) over
vehicle type in enough detail to monitor vehicle coverage.

The pattern reports 84 identify trends observed in the data.
These reports are useful to identify, for example: significant
numbers of similar vehicles that have failed the inspection or
emissions test because of a common defect; vehicle failure
rates by season or geographical region; and vehicles with high
multiple retest rates, to identify failure types or vehicles that
the repair industry has problems diagnosing.

In the first section of this description of the invention, the
repair report 46 generation was described. In the second
section, the update process 68 and steps prior to that process
have been described in detail. In this final section more detail
is provided with respect to the emission inspection lane 20
and the interaction with the reports which are generated
upon an emissions test.

Turning to FIG. 7, some of the elements shown in FIG. 5
and FIG. 1 are shown in combination. As shown in both FIG.
1 and FIG. 7, the diagnostic controller 91 serves as an
interface between the classifier table 41 and the lane
controller 27. Before describing the elements of FIG. 7 in detail,
a summary is provided.

The existing lane controller 27 performs the vehicle
emissions test. The lane controller 27 also makes a request
to the OBD module 93 to determine the status of the
vehicle's on-board diagnostic system MIL (malfunction
indicator light), when applicable, and downloads the OBD
diagnostic trouble codes and sensor data resident in the OBD
computer memory as a result of the malfunctions. The OBD
module 93 may also access real-time sensor data generated
by the vehicle OBD computer as an input to the diagnostic
subsystem.

The lane controller 27 issues a diagnosis request to the
diagnostic controller 91 when a vehicle has failed the
inspection process. The request is accompanied by vehicle
characteristics and test data needed as input to the failure
analysis module 97. Under certain circumstances, selected
vehicle OBD sensor data is input to the failure analysis
module 97.

The failure analysis module 97 formats the data for
compatibility with the classifier table files 41, including
computing derived values from I/M 240 second-by-second
data where applicable. The failure analysis module 97
performs a lookup in the appropriate classifier table file and
retrieves a failure probability for each failure category to be
diagnosed. The categories are ranked according to the
probability and input to the diagnostic integration module 98.
The diagnostic integration module 98 reconciles the
probabilistic results with recent repairs made to the vehicle (for
those undergoing multiple test failures) and OBD diagnostic
trouble codes retrieved from the vehicle (under circumstances
where OBD input is used). The integrated diagnosis
is sent to the failure report module 99 for creation of the
report.

Here, symptoms associating this vehicle with each failure
category are retrieved from the classifier table 41 and
converted to an English-language phrase by the justification
module 103. Also, the relative
incidence of failure in the general failure population for each
failure category is retrieved from the failure frequency
distribution file 56 for inclusion in the failure report.
Moreover, included in the failure report are the plots of the
vehicle's emissions results compared with a typical passing
vehicle and any OBD results obtained. The failure report 31
and 32 is created and given to the motorist for use by his or
her repair technician.

Turning to the details of the features shown in FIG. 7, the 
OBD module is first discussed. Generally, the data gathered 
at the lane 20 such as diagnostic 92 and OBD 93 records are 
also uploaded to the host in order to recreate any failure 
report. The contents of the OBD record 93 shall include the 
diagnostic trouble codes and the frames of data from the 
OBD session. The diagnostic data 27 items include the 
version number of the classifier, failure categories, and 
associated probabilities of failure. These data items are sent 
to the host and stored in a database table on the host 
computer. 

Diagnostic records 92 are sent by the lane 20 via two
transactions. One transaction sent by the lane is called the
Put Diagnostic Rec and is identified by arrow 94. This
transaction sends the diagnostic records to the host as
identified by arrow 96. After the host receives the diagnostic
records, the host sends a RSP_ACK to indicate that the data
has been delivered.

The diagnostic controller 91 serves as an interface 
between the classifier table 41 and operations with which it 
is associated and the lane 20 software 27. The diagnostic 
controller module 91 serves as an executive for the classifier 
table 41. 

The controller 27 will be implemented as an independent
task. This task will be invoked when the vehicle gets to a
position where the report needs to be generated. The
diagnostic controller 91 accepts a diagnostic request from the
lane 20 controller 27. This request indicates that failure
analysis should now be performed.

The calling sequence of the controller 27 is as follows:
upon receiving a diagnostic request, the controller invokes
the failure analysis module 97, which reads in the necessary
data and performs the failure analysis. The failure analysis
module 97 invokes the diagnostic integration module 98,
which integrates information concerning repairs that have
already been done for the current vehicle being tested and
the OBD test results. The diagnostic integration module 98
invokes the failure report module 99, which produces a
failure report 31.

The vehicle data used as input to the failure analysis
module 97 is preferably formatted to ensure compatibility
with the data in the classifier table file 41. Any necessary
units conversion and formatting is performed upon the
vehicle data used as input to this module.

One of the inputs to the failure analysis module is the
second-by-second data produced by the I/M 240 test. This
data, known as Block 3 (see above), is summed for
acceleration, cruise, deceleration, and transient for each kind
of exhaust gas. These computed values and the vehicle test
records serve as input to the failure analysis 97.
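
The Block 3 summation can be sketched as follows, using the mode boundaries listed in Table 1. The function name and the treatment of each mode as a half-open interval of seconds (so that a boundary second is counted only once) are assumptions made for this illustration.

```python
# (mode number, mode name, from second, to second), as listed in Table 1.
MODES = [
    (1, "accel", 0, 15), (2, "transient", 15, 54), (3, "cruise", 54, 79),
    (4, "decel", 79, 93), (5, "accel", 93, 106), (6, "transient", 106, 156),
    (7, "accel", 156, 187), (8, "cruise", 187, 200), (9, "decel", 200, 239),
]


def block3_sums(trace, gas):
    """Sum a second-by-second trace over the four driving modes.

    trace is a sequence of readings indexed by second for one exhaust
    gas; the result maps names such as hc_accel to the summed value.
    Assumption: each mode covers seconds start through stop - 1.
    """
    sums = {}
    for _, mode, start, stop in MODES:
        key = f"{gas}_{mode}"
        sums[key] = sums.get(key, 0.0) + sum(trace[start:stop])
    return sums
```

With a constant trace of 1.0 per second, hc_accel comes out as 59.0, the combined length of modes 1, 5 and 7. Under the half-open convention used here, second 239 itself is not counted.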

As discussed in detail above, the failure analysis module
97 uses the classifier table 41 that was generated by the
learning process (see FIG. 5). There is a separate classifier
table for each type of test conducted. If a lane 20 uses both
the two-speed idle test and the I/M 240 test, both classifier
tables are stored. The failure analysis module checks a field
known as the test_type in the vehicle test record transmitted
by arrow 94 to determine whether the test under way is a
two-speed idle or an I/M 240 test. The failure analysis
module 97 implements a classifier lookup in order to execute
a rule for a given failure category (see discussion above).
The output of the table lookup is the rule name, the
probability that a failure category is true, and the
associated failure category. The rule name, failure category,
and probability of the failure are stored in a failure
probability table 101 for output shown as FIG. a-d.
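A classifier lookup of this kind might be sketched as below. The rule representation (a list of field/predicate pairs standing in for the left hand side), the field names, the test_type values, and the example categories and probabilities are all hypothetical; the specification does not give the table format at this point.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    name: str
    # Left hand side: (field name, predicate on that field's value) pairs.
    conditions: List[Tuple[str, Callable[[float], bool]]]
    failure_category: str
    probability: float


def classify(record, tables):
    """Pick the classifier table by test_type and return every rule that fires.

    Each result is (rule name, failure category, probability), the three
    items the lookup is described as producing.
    """
    table = tables[record["test_type"]]
    results = []
    for rule in table:
        if all(field in record and test(record[field])
               for field, test in rule.conditions):
            results.append((rule.name, rule.failure_category, rule.probability))
    return results


# One made-up table per test type, as when a lane runs both tests.
EXAMPLE_TABLES = {
    "IM240": [Rule("R1", [("hc_cruise", lambda v: v > 1.0)], "EGR system", 0.62)],
    "two_speed_idle": [Rule("R7", [("co_idle", lambda v: v > 3.0)], "Fuel metering", 0.4)],
}
```

Keeping one table per test type keyed by the test_type field means the same lookup routine serves both the two-speed idle and I/M 240 cases.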

The failure probability table 101 contains the rule name
and the associated failure category and failure probability
values. Functions such as init, add, delete, and sort are
supplied as operations of the table. The sort operation sorts
the table in descending order of failure probabilities.
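As an illustration only, the table and its four operations could look like the class below. The class name, the tuple layout, and the choice to delete by failure category are assumptions of this sketch rather than details given in the text.

```python
class FailureProbabilityTable:
    """Hypothetical in-memory form of the failure probability table."""

    def __init__(self):
        self.init()

    def init(self):
        """Reset the table to empty."""
        self.entries = []  # (rule_name, failure_category, probability)

    def add(self, rule_name, failure_category, probability):
        """Append one prediction produced by the classifier lookup."""
        self.entries.append((rule_name, failure_category, probability))

    def delete(self, failure_category):
        """Remove every entry for the given failure category."""
        self.entries = [e for e in self.entries if e[1] != failure_category]

    def sort(self):
        """Order entries from highest to lowest failure probability."""
        self.entries.sort(key=lambda e: e[2], reverse=True)
```

Sorting in descending order puts the most likely failure category first, which is the order in which a ranked report would list them.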

Turning to the OBD module 93, it maps observed diagnostic
trouble codes onto the failure category file 57. Failure
prediction is suppressed for those failure categories, similar
to the multiple retest method. Other embodiments include
suppression of a diagnostic assessment or the use of the
retrieved OBD sensor data as inputs to the failure analysis
module 97 for all vehicles supporting OBD.

The OBD module 93 performs downloading of the data
from the vehicle. If the vehicle supports OBD, a message is
sent to the lane 20 inspector instructing him or her to inspect
OBD MIL status and functionality. After the lane 20 inspector
connects the OBD cable to the vehicle and initiates
downloading, OBD data is downloaded into a table on the
lane 20 called OBD table 102. Two kinds of data are
downloaded: diagnostic trouble codes and frames of sensor
data.

The integration function of the OBD module 93 involves
mapping the OBD diagnostic trouble codes to failure
categories. Each diagnostic trouble code is mapped to a failure
category in the table 57 in the following manner: for each
diagnostic trouble code, a corresponding failure category is
recorded in the adjacent column of the OBD table. A data
table is available that associates each diagnostic trouble code
with a failure category, if one exists.
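The mapping step amounts to a table lookup that may come back empty. The sketch below assumes a small dictionary in place of the data table; the specific code-to-category pairs and the function name are illustrative only and are not the contents of file 57.

```python
# Illustrative mapping only; the real association table is not reproduced
# in the text, so these entries are assumptions.
DTC_TO_CATEGORY = {
    "P0401": "EGR system",
    "P0420": "Catalytic converter",
}


def map_trouble_codes(codes, mapping=DTC_TO_CATEGORY):
    """Pair each downloaded trouble code with its failure category.

    The second item of each pair plays the role of the adjacent column of
    the OBD table; it is None when no failure category exists for the code.
    """
    return [(code, mapping.get(code)) for code in codes]


obd_table = map_trouble_codes(["P0420", "P1234"])
```

Codes without a corresponding category are kept in the table with an empty category rather than dropped, so the downloaded data stays complete.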

The OBD module 93 supplies functions to access the OBD
table 102. The diagnostic trouble code and associated failure
category are found in this table. Functions such as init, add,
and delete are supplied for this table. Module 93 also
supplies functions to build an OBD record from information
in the OBD table and to send this record for storage in the
OBD database.

Turning to the diagnostic integration module 98, this
module integrates the probabilistic failure analysis results
with other sources of information available about the
vehicle. For vehicles undergoing a second (or greater
number) retest, there is a possibility that data will conflict.
To avoid that possibility, steps can be taken.

The results of a recent repair session are available on the
vehicle emissions repair report 32 surrendered to the lane 20
inspector at the outset of the retest. The system displays
a menu on a screen consisting of a list of possible repair
actions that are identical to those on the vehicle emissions
repair report 32. The lane 20 inspector will move the
up/down arrow key to a repair action on the screen that
matches a repair that has already been done for the current
vehicle and hit the enter key to select that repair. The repairs
that have been input by the lane 20 inspector are mapped to
the failure categories used by the repair action/failure
category file 57. If a particular vehicle has had one or more
repair actions performed in a given failure category, that
entry is deleted in the failure probability table 101 so that a
probabilistic prediction will not be made in the failure report
32.

Another source of possibly conflicting information about
the vehicle is the OBD data retrieved from the on-board
computer. This is one possible means of integration. The
diagnostic integration module invokes a function in the
OBD module that maps the OBD diagnostic trouble codes to
corresponding categories. Each failure found in the OBD
table is compared with the failures found in the failure
probability table 101. If a match exists and the failure
probability of the matching failure in the failure probability
table 101 is less than some threshold probability value, then
that failure is deleted from the failure probability table.

The diagnostic integration module 98 also builds a
diagnostic record from the information contained in the failure
probability table as well as the version number of the
classifier. The diagnostic record is sent to the host 28 for
storage in a data base.

Turning to the failure report module 99, the failure report,
when generated, includes all diagnostic information obtained.
A public domain plotting package is used to construct the
failure report.

The failure categories and the corresponding probabilities
are retrieved from the failure probability table 101. The
failure categories and failure probabilities are shown ranked
from most likely to least likely. Also given is the frequency
with which this failure category occurs in the failed vehicle
general population (retrieved from the failure frequency
distribution file 56). For each failure category in the failure
report, there is a justification stating the vehicle's symptoms
that are associated with that category. The justification is
supplied by the justification module 103 and will be in the
form of an English-like phrase. Moreover, OBD codes and
data are also read from the OBD table 102 and inserted into
the failure report.

The failure report module 99 also plots the emissions test
values for CO, HC, CO2, and NOx from the vehicle's I/M
240 test against a set of typical passing values if the I/M 240
was performed (see FIG. 2). The vehicle's test values and
typical passing values are available at the lane 20. Both of
these files are used as input to a plotting package to plot out
the driving trace and typical passing values. The graph of the
driving trace is plotted, for example, in terms of grams per
mile vs. second.

Turning to the justification module 103, its role is to
supply the symptoms that justify the probability of failure
generated by the failure analysis module 97. The symptoms
are the left hand side (depending upon format) of the
applicable rule from the classifier table file (see above).

To perform a justification, the first step is to determine
which rule has been executed to produce the failure category
and associated probability of failure. The second step is to
look up those conditions on the left hand side of this rule and
convert those conditions into an English-like phrase that
justifies the failure category diagnosis.

The justification module 99 operates as follows: first, the
justification module gets the rule name from the failure
probability table 102; next, the classifier table file 41 is
searched until a rule is found that matches the rule name.
The conditions of that matching rule are used as input for
the justification phrase pertaining to that rule. The
justification module 99 also supplies a translation routine
that translates the left hand side conditions derived from
the rule into an English-like phrase for purposes of
producing justification statements in the failure report.

To help the motorist whose vehicle failed the emissions
test understand the diagnostic assessment relating to his or
her vehicle, a preprinted brochure is given to that motorist.
The brochure explains the probabilistic approach taken and
the error expected. It also serves as a key to the failure
categories used in the report and states which repair actions
make up each failure category.

Accordingly, in summary, the above detailed description
has described the features of this invention including the
preparation of a diagnostic report with a diagnostic
assessment for a vehicle owner to use in repairing his or her
vehicle to bring its emissions into compliance with emission
standards. That is, the diagnostic assessment gives the
vehicle owner's service technician probabilistic information
about the causes of the vehicle's failure of the emissions
test. The description also has shown how the diagnosis is
derived from operations involving a classifier table which
stores previously derived rules which form the basis for the
prediction of the diagnostic assessment. Also included is a
detailed description of how an updated classifier table is
generated where a vehicle which previously failed the
emission test finally passes and how the information relating
to the passing test is used to update the classifier table.

We claim:

1. A method for generating a diagnosis of a vehicle's
cause of failure of an emissions test, comprising the steps of:

in a storage media, storing a classifier table composed of
a set of rules;

in an emissions testing facility, receiving and storing
vehicle characteristics signals;

in said emissions testing facility, sampling emissions of
said vehicle to create emission test results to generate
a series of emission test signals;

transmitting to a processor said emissions test signals;

transmitting to said processor said vehicle characteristics
signal; and

comparing said emission test signals and said vehicle
characteristics signals to said classifier table's set of
rules in a manner in which forms said diagnosis of said
vehicle's cause of failure and generating a prediction
report thereof.

2. A method as recited in claim 1 further comprising the
step of matching said prediction report with a failure
frequency distribution file to generate a diagnostic assessment
report including failure probabilities.

3. A method as recited in claim 2 further comprising the
step of transmitting said diagnostic assessment report to a
host computer where said diagnostic assessment report is
stored.

4. A method as recited in claim 1 further comprising the
step of periodically updating said classifier table.

5. A method as recited in claim 1 wherein data is gathered
relating to a plurality of vehicles which had failing
emissions tests results and for which a prediction report was
generated and wherein each of said plurality of vehicles has
its own vehicle characteristics and also has passing
emissions test results and a repair report, said method further
comprising the steps of:

receiving said vehicle characteristics signals, said failing
emission test signals and said passing emission test
signals of said plurality of vehicles;

dividing said vehicle characteristics signals and said
failing and passing emission test signals of said plurality of
vehicles to generate a training dataset and a testing
dataset; and

correlating said plurality of repair reports with said
training dataset to form new rules;

from said new rules, forming new classifiers.

6. A method as recited in claim 5 wherein prior to said 
dividing step, said method further comprising the steps of: 

filtering said vehicle characteristics signals and failing 
and passing emission test signals of said plurality of 
vehicles to remove certain data;

formatting said vehicle characteristics signals and failing 
and passing emission test signals of said plurality of 
vehicles; and 

weighting said vehicle characteristics signals and failing 
and passing emission test signals of said plurality of
vehicles. 

7. A method as recited in claim 5 further comprising the 
steps of: 

processing said testing dataset and said classifier table to 
form first output;

processing new classifiers to form second output; 

comparing first output with said second output to generate 
a set of updated classifier rules; 

forming an updated classifier table from said updated 
classifier rules.

8. A system for generating a diagnosis of a vehicle's
cause of failure of an emissions test, said vehicle having
vehicle characteristics which form input to said system in
the form of a vehicle characteristic signal, comprising:

stored in a storage media, a classifier table composed of
a set of rules; 

in an emissions testing facility, a receiving component for 
receiving said vehicle characteristics signal; 

in said emissions testing facility, an emissions sampling
apparatus to sample the emissions of said vehicle to 
create emission test results and which generates a series 
of emission test signals; 

a transmitter for transmitting to a processor said emissions 
test signals;

a transmitter for transmitting to said processor said 
vehicle characteristics signal; 

a comparator for comparing said emission test signals and 
said vehicle characteristics signal to said classifier 
table's set of rules in a manner in which forms said
diagnosis of said vehicle's cause of failure and gener- 
ating a prediction report thereof. 

9. A system as recited in claim 8 wherein said system also 
stores a frequency distribution file, said system further 
comprising:

a matching component which matches said prediction
report with said failure frequency distribution file to
generate a diagnostic assessment report including
failure probabilities.

10. A system as recited in claim 9 further comprising a 
transmitter for transmitting said diagnostic assessment 
report to a host computer where it is stored. 

11. A system as recited in claim 8 wherein said classifier 
table is periodically updated. 

12. A system as recited in claim 8 further comprising a 
plurality of vehicles which had failing emissions tests results 
and for which a prediction report was generated and wherein 
each of said plurality of vehicles has its own vehicle 
characteristics and also has passing emissions test results 
and a repair report, said system further comprising: 

a receiver which receives said vehicle characteristics 
signals, said failing emission test signals and said 
passing emission test signals of said plurality of 
vehicles; 

a division component which divides said vehicle charac- 
teristics signals and said failing and passing emission 
test signals of said plurality of vehicles to generate a 
training dataset and a testing dataset; and 

a correlation component which correlates said plurality of 
repair reports with said training dataset to form new 
rules and therefrom, new classifiers. 

13. A system as recited in claim 12 further comprising:

a filter which filters said vehicle characteristics signals
and failing and passing emission test signals of said
plurality of vehicles so that said signals meet selection
criteria;

a formatting component which formats said vehicle char- 
acteristics signals and failing and passing emission test 
signals of said plurality of vehicles to be acceptable to 
said correlation component; and 

a weight component which weights said vehicle
characteristics signals and failing and passing emission test
signals of said plurality of vehicles so that each
signal is appropriately scaled.

14. A system as recited in claim 12 further comprising:

a processor which processes said testing dataset and said
classifier table to form first output;

a processor which processes new classifiers to form 
second output; and 

a comparator which compares first output with said sec- 
ond output to generate a set of updated classifier rules 
to form an updated classifier table. 

15. A system for generating a diagnosis of a vehicle's 
cause of failure of an emissions test, comprising: 

a storage media configured to store data base including a 
classifier table's set of rules; 

a processor configured to receive input representing 
vehicle characteristics signals; 

a processor configured to receive input representing emis- 
sion test signals; 

a processor configured to compare said emission test 
signal and said vehicle characteristics signal to said 
classifier's table's set of rules in a manner in which 
forms said diagnosis of said vehicle's cause of failure 
and generating a prediction report thereof. 

16. A system as recited in claim 15 further comprising:

a storage media configured to store a failure frequency
distribution file;

a processor configured to match said prediction report
with said failure frequency distribution file to generate
a diagnostic assessment report including failure
probabilities.
17. A system as recited in claim 15 wherein said storage
media is further configured to store a data base including an 
updated classifier table's set of rules. 

18. A method for providing a system for generating a 
diagnosis of a vehicle's cause of failure of an emissions test, 
comprising: 

providing a storage media configured to store data base 
including a classifier table's set of rules; 

providing a processor configured to receive input repre- 
senting vehicle characteristics signals; 

providing a processor configured to receive input repre- 
senting emission test signals; 

providing a processor configured to compare said emis- 
sion test signal and said vehicle characteristics signal to 
said classifier's table's set of rules in a manner in which 
forms said diagnosis of said vehicle's cause of failure
and generating a prediction report thereof. 

19. A method as recited in claim 18 further comprising the 
steps of: 

providing a storage media configured to store a failure 
frequency distribution file; and 

providing a processor configured to match said prediction 
report with said failure frequency distribution file to 
generate a diagnostic assessment report including fail- 
ure probabilities. 

20. A method as recited in claim 18 further comprising the 
step of: 

storing on said storage media an updated classifier table's 
set of rules. 